Abstract: The decoding performance of Fountain codes over the
binary erasure channel (BEC) depends on two aspects. One is the
intrinsic code structure, which stopping set analysis addresses.
The other is the effect of the channel characteristics, which is
difficult to estimate precisely. To tackle these problems, this
paper proposes analyzing the performance of Fountain codes
through the uncorrectable set. We give the condition for Fountain
decoding failure over the BEC. Then, we analyze the uncorrectable
set of Fountain codes. Finally, we combine the stopping set and
the uncorrectable set to provide an integrated analysis of the
performance of Fountain codes over the BEC.

Index Terms —fountain codes, stopping set, uncorrectable set. 
I. Introduction 

For binary linear codes, the performance of belief
propagation (BP)-based iterative decoding over the binary
erasure channel (BEC) is dominated by stopping sets. Stopping
sets were first introduced for the analysis of low-density
parity-check (LDPC) codes over BECs [1]. It was shown that
the iterative decoder fails to decode to a codeword if and
only if the set of erasure positions is a superset of some
stopping set in the Tanner graph. In particular,
the number and the size of stopping sets are important for
determining the performance of iterative decoders. Stopping
sets of small size over the BEC can lead to a small Hamming
distance. The success of stopping sets in analyzing LDPC
codes has created a paradigm for analyzing
other codes. For example, Rosnes and Ytrehus applied
the concept of stopping sets to the analysis of turbo decoding and
proposed the turbo stopping set [2]. Abdel-Ghaffar and Weber
derived an equation for the number of stopping sets
of a full-rank parity-check matrix of the Hamming code [3].
Tuvi examined the stopping redundancy of Reed-Muller codes
[4]. Wadayama presented the stopping sets of redundant random
ensembles [5].

Recently, much attention has been given to a class of 
error-control codes, Fountain codes, due to their excellent 
performance, especially in erasure channels and the simplicity 
of encoding and decoding. 

Three typical examples of rateless codes have been developed
from the Fountain code concept: Luby Transform (LT) codes
[6], Raptor codes [7], and Online codes. Because LT codes have the
basic structure of the Fountain code family, many studies on error
analysis have been based on LT codes.


For instance, the error analysis reported in [8] gave a basic
result that depends on the exact calculation of the error
probability. The works in [9] and [10] each developed stopping
criteria to detect that decoding can terminate earlier, at
a lower cost.

Although the error-control mechanism in Fountain codes
facilitates error analysis, two major factors still affect the
decoding performance of Fountain codes on the BEC. One is the
intrinsic code structure, which stopping set analysis addresses. The
other is the effect of the channel characteristics, which has
not yet been effectively resolved. Current finite-length analysis
still focuses on the former problem, the error-prone
structures of codes. It is much more difficult to give a
precise estimate of error-prone patterns.

Because the Fountain code family is nonsystematic,
unlike existing families such as LDPC and
Turbo codes, the conventional stopping set is not directly applicable. To
overcome this problem, this study focuses on
performance analysis when output nodes are erased. We introduce
and define the uncorrectable set of Fountain codes in order to analyze
their decoding performance over the BEC, and we
analyze the average bit erasure probability of Fountain codes
over the BEC.

The rest of this paper is organized as follows. Section II
briefly reviews LT codes. Section III then describes the Fountain
uncorrectable set. Next, Section IV derives the
bit erasure probability, followed by the integrated performance analysis in
Section V. Conclusions are drawn in Section VI.

II. Preliminaries 
A. Principle of LT codes 

Fountain codes include three typical classes: Luby Transform
(LT) codes, Raptor codes, and Online codes. Among
these, LT codes are the basis for constructing the other families. An LT
code retains the good performance of the random linear fountain code
while drastically reducing the complexity of both encoding
and decoding. During encoding, the uncoded
message is divided into $k$ blocks of roughly equal length. The
degree $d$ ($1 \le d \le k$) of the next packet is chosen at random.
Accordingly, $d$ input symbols are chosen uniformly at random.

Let $G$ denote the generator matrix of an LT code of a given
length. The encoding can be represented by

$$t_i = \sum_{j=1}^{k} x_j \, G_{ji}, \tag{1}$$

where $n$ is the code length, $k$ is the number of input symbols,
$t_i$ denotes the $i$th encoded symbol, and $x_j$ denotes the $j$th
encoding symbol. Without loss of generality, this paper
assumes that the symbols are binary.
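
As an illustration of the encoding rule in (1), the following minimal Python sketch builds a random generator matrix column by column and computes each encoded bit as a sum over GF(2). The function name `lt_encode`, the example message, and the degree weights are hypothetical choices made for this sketch; the paper does not prescribe a particular degree distribution here.

```python
import random

def lt_encode(x, n, degree_weights, seed=0):
    """Return (G, t): a k-by-n 0/1 generator matrix and the n encoded bits.

    degree_weights[d-1] is the (assumed) weight of output degree d.
    """
    rng = random.Random(seed)
    k = len(x)
    degrees = list(range(1, len(degree_weights) + 1))
    G = [[0] * n for _ in range(k)]
    t = []
    for i in range(n):
        # Sample the degree of the next packet, then d distinct inputs uniformly.
        d = rng.choices(degrees, weights=degree_weights)[0]
        for j in rng.sample(range(k), d):
            G[j][i] = 1
        # Equation (1) over GF(2): t_i = sum_j x_j * G_ji (mod 2).
        t.append(sum(x[j] * G[j][i] for j in range(k)) % 2)
    return G, t

x = [1, 0, 1, 1]                     # k = 4 binary input symbols
weights = [0.4, 0.3, 0.2, 0.1]       # hypothetical weights for d = 1..4
G, t = lt_encode(x, n=6, degree_weights=weights)
```

Each column of `G` is the neighbourhood of one output symbol, so the same matrix also describes the bipartite graph used in the rest of the paper.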

B. The graph representation of LT codes 

The parity-check matrix $H$ can also be represented by a
bipartite graph $\mathcal{G} = (V \cup C, \mathcal{E})$, where the set of variable nodes
$V$ represents the codeword symbols, the set of check nodes
$C$ represents the parity-check constraints satisfied by the
codeword bits, and the edges satisfy $\mathcal{E} \subseteq \{(v,c) \mid v \in V, c \in C\}$. First, let
us briefly review conventional stopping sets in LDPC codes.
The concept of a stopping set is defined on the Tanner
graph. A stopping set $S$ is a subset of the variable
nodes in the Tanner graph of a code such that every neighbor of
$S$ is connected to $S$ at least twice.
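
The stopping set condition above can be checked mechanically. The sketch below is a minimal illustration, assuming the parity-check matrix is given as a list of 0/1 rows (one row per check node); the function name `is_stopping_set` and the small matrix `H` are made up for this example.

```python
def is_stopping_set(H, S):
    """Return True if no check node is connected to S exactly once.

    H: list of rows of a 0/1 parity-check matrix (one row per check node).
    S: set of variable-node (column) indices.
    """
    for row in H:
        hits = sum(row[v] for v in S)
        if hits == 1:
            # This check is a neighbor of S but touches it only once.
            return False
    return True

# Hypothetical 3-check, 4-variable example.
H = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [1, 0, 1, 1],
]
print(is_stopping_set(H, {0, 1, 2}))  # every check touches the set twice
print(is_stopping_set(H, {0, 1}))     # the second check touches it once
```

By this definition the empty set is trivially a stopping set, which is the usual convention.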

For a given matrix $G_{k,n}$, let $X = (x_1, x_2, \dots, x_k)$ denote
the encoding symbols and let $T = (t_1, t_2, \dots, t_n)$ denote the
codeword. Then $X \cdot G_{k,n} = T$. In the general case, the relation
$G_{k,n} H^T = 0$ is used to compute the parity-check matrix
$H$.

For a binary linear systematic code such as an LDPC code, the
parity-check matrix $H$ is obtained from $G_{k,n} H^T = 0$. The matrix
$H$ can verify the estimate of $X = (x_1, x_2, \dots, x_k)$
because the code is systematic: $T$ can be written as
$T = (x_1, x_2, \dots, x_k, p_1, p_2, \dots, p_{n-k})$, where $p_1, p_2, \dots, p_{n-k}$
denote the parity bits. Thus, the encoding bits $X$ are included
in the transmitted bits $T$ and are sent to the receiver.

However, LT codes are nonsystematic codes that transmit
only parity symbols, so $T$ can be written as $T =
(p_1, p_2, \dots, p_{n-k})$. The transmitted symbols do not include
the encoding symbols $X$. The matrix $H$ deduced from
$G_{k,n} H^T = 0$ therefore verifies only the transmitted symbols $T$,
not the encoding bits $X$. Here we are concerned
only with the validity of the encoding symbols $X =
(x_1, x_2, \dots, x_k)$, not with that of the transmitted symbols
$T = (t_1, t_2, \dots, t_n)$. Therefore, the conventional construction of the
parity-check matrix $H$ must be changed to suit
LT codes.

We propose a method for constructing the parity-check
matrix of LT codes. Since bits transmitted over the BEC are either
lost or received correctly, all
received bits are correct. Let $P = (p_1, p_2, \dots, p_r)$ represent
the received bits. The columns of the matrix $G_{k,n}$ corresponding
to $P$ make up the matrix $G_{k,r}$, so that

$$X \cdot G_{k,r} = P. \tag{2}$$

Let $\mathcal{G}\{k, \lambda(v), \rho(d)\}$ denote a Fountain code ensemble,
where $k$ is the number of input symbols, $\lambda(v)$ is the degree distribution
of the input nodes, and $\rho(d)$ is the degree distribution of the output
nodes. From the above analysis, the matrix $G_{k,r}$ plays the role
of a parity-check matrix that can verify the encoding bits
$X = (x_1, x_2, \dots, x_k)$ in Fountain codes. Hence, a particular
code $G \in \mathcal{G}$ can also be represented by a bipartite graph
$G = \{V \cup C, \mathcal{E}\}$, where the set of variable nodes $V$ represents
the $k$ input nodes, corresponding to the input symbols, the set of
check nodes $C$ represents the parity-check constraints
satisfied by the input symbols, corresponding to the output
symbols, and the edges satisfy $\mathcal{E} \subseteq \{(v, c) \mid v \in V, c \in C\}$.
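
To make relation (2) concrete, the sketch below keeps only the columns of a small generator matrix that correspond to received output symbols and checks that the input symbols still satisfy $X \cdot G_{k,r} = P$ over GF(2). The helper names `received_submatrix` and `encode`, and the example matrix, are hypothetical and serve only as an illustration.

```python
def received_submatrix(G, received):
    """Keep the columns of the k-by-n matrix G indexed by `received`."""
    return [[row[i] for i in received] for row in G]

def encode(x, G):
    """Compute X * G over GF(2) for a k-by-m 0/1 matrix G."""
    m = len(G[0])
    return [sum(x[j] * G[j][i] for j in range(len(x))) % 2 for i in range(m)]

# Hypothetical code with k = 3 inputs and n = 5 outputs.
G = [
    [1, 0, 1, 0, 1],
    [1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
]
x = [1, 0, 1]
t = encode(x, G)                 # transmitted symbols T

received = [0, 2, 3]             # outputs 1 and 4 are erased by the BEC
G_kr = received_submatrix(G, received)
p = [t[i] for i in received]     # received symbols P

# Relation (2): the received symbols are consistent with X and G_{k,r}.
print(encode(x, G_kr) == p)
```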

III. Fountain Uncorrectable Set 

In this section, we analyze the decoding performance of
Fountain codes over the BEC. It is known that the number of
output symbols directly affects the performance of iterative
decoding algorithms. Following the above analysis, we
build the Tanner graph of $G_{k,n}$, as shown in Fig. 1. Circular
nodes correspond to the input symbols, and rectangular
nodes correspond to the output symbols. There is an edge
between an input symbol and an output symbol if and only if
$a_{ij} = 1$, where $a_{ij}$ denotes the element of the generator matrix in
the $i$th row and $j$th column.

We define the Fountain uncorrectable set as follows.

Definition 1. An uncorrectable set $U$ in a Fountain code
is a subset of the information nodes $V$ such that all output nodes
directly connected to $U$ are erased.

As shown in Fig. 1, the different line types indicate an
uncorrectable set. For the code in Fig. 1, if only $c_1$ is erased,
the maximal uncorrectable set is $U = \emptyset$. If $c_1$, $c_2$, and $c_3$
are erased, the connected nodes $v_1$ and $v_4$ cannot
be decoded successfully. Accordingly, the uncorrectable set is
$U = \{v_1, v_4\}$.

Properties. An uncorrectable set has the following
properties:

1) The union of uncorrectable sets is also an uncorrectable
set.

2) Each erasure pattern contains a unique maximal
uncorrectable set, which may be the empty set.
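
The definition and the two properties can be illustrated with a short Python sketch. It reads Definition 1 as "every output node adjacent to the set is erased", so the maximal uncorrectable set is simply the set of input nodes whose neighbors are all erased. The graph below is a made-up example (it is not the graph of Fig. 1), indices are zero-based, and an input node with no neighbors at all would be counted as uncorrectable under this reading.

```python
def neighbours(G, j):
    """Output nodes adjacent to input node j (row j of the 0/1 matrix G)."""
    return {i for i, bit in enumerate(G[j]) if bit}

def is_uncorrectable(G, U, erased):
    """True if every output node adjacent to a node of U is erased."""
    return all(neighbours(G, j) <= set(erased) for j in U)

def maximal_uncorrectable_set(G, erased):
    """Union of all uncorrectable sets for the given erasure pattern."""
    return {j for j in range(len(G)) if neighbours(G, j) <= set(erased)}

# Hypothetical graph: 4 input nodes (rows), 5 output nodes (columns).
G = [
    [1, 1, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [1, 0, 1, 0, 0],
]

print(maximal_uncorrectable_set(G, {0}))        # only one output erased
print(maximal_uncorrectable_set(G, {0, 1, 2}))  # three outputs erased

# Property 1: the union of two uncorrectable sets is uncorrectable.
erased = {0, 1, 2}
A, B = {0}, {3}
print(is_uncorrectable(G, A | B, erased))
```

Because membership is decided node by node, the maximal set for a given erasure pattern is unique, which matches Property 2.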

IV. Symbol Erasure Probability 

For a particular code $G$ in a given ensemble
$\mathcal{G}(k, \lambda(v), \rho(d))$, let $P_b(G,\epsilon)$ denote the expected bit
erasure probability if $G$ is used to transmit over a BEC with
erasure probability $\epsilon$. Let $E_{\mathcal{G}(k,\lambda(v),\rho(d))}[P_b(G,\epsilon)]$ denote the
corresponding ensemble-average bit erasure probability.
Assume that the number of erased bits is $|e|$, where $e$ denotes
the erasure pattern. Label the $E(e)$ output node sockets
in some arbitrary but fixed way with elements from the
set $e$. Similarly, label the input node sockets in some
arbitrary but fixed way with elements from the set $V(e)$. The
elements of $V(e)$ cannot be recovered. As shown in Fig. 2,
the black rectangular nodes correspond to the $|e|$ lost
output symbols. The black circular nodes correspond to the
$V$ input symbols that cannot be recovered because all output symbols
incident upon them are lost. The gray circular nodes
correspond to input symbols that may be recovered because
not all output symbols incident upon them are lost.

When $|e|$ output nodes are lost, the edges incident upon
them are also lost. Next, the number of edges
connected to the $|e|$ lost output nodes is computed.
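
The number of edges attached to a set of lost output nodes can be tabulated with a small dynamic program. The sketch below counts the sets of $m$ output nodes whose degrees sum to $L$, which is the coefficient of $x^m z^L$ in $\prod_c (1 + x z^{d_c})$ taken over the output nodes $c$ with degrees $d_c$. The function names and the degree list are hypothetical, and the normalization by $\binom{n}{m}$ assumes that all erasure patterns of size $m$ are equally likely.

```python
from itertools import combinations
from math import comb

def count_sets(degrees, m, L):
    """Coefficient of x^m z^L in prod_c (1 + x z^{d_c}) over output degrees."""
    n, total = len(degrees), sum(degrees)
    if not (0 <= m <= n and 0 <= L <= total):
        return 0
    # table[a][b] = number of subsets of size a whose degrees sum to b
    table = [[0] * (total + 1) for _ in range(n + 1)]
    table[0][0] = 1
    for d in degrees:
        # Descending a keeps each node from being used twice.
        for a in range(n - 1, -1, -1):
            for b in range(total - d + 1):
                if table[a][b]:
                    table[a + 1][b + d] += table[a][b]
    return table[m][L]

def edge_count_probability(degrees, m, L):
    """Chance that m erased outputs (m <= n), chosen uniformly, carry L edges."""
    return count_sets(degrees, m, L) / comb(len(degrees), m)

degrees = [1, 2, 2, 3]   # hypothetical output-node degrees
# Brute-force cross-check for m = 2, L = 4.
brute = sum(1 for s in combinations(degrees, 2) if sum(s) == 4)
print(count_sets(degrees, 2, 4), brute)
print(edge_count_probability(degrees, 2, 4))
```

The dynamic program avoids expanding the polynomial explicitly, which keeps the cost polynomial in the number of output nodes and the total number of edges.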
Fig. 1. Fountain uncorrectable set.

Fig. 2. There are $|e|$ output nodes lost, which leads to $V$ undecodable input nodes.

Theorem 1. The probability that $L$ edges are
connected to the set $e$ in the Fountain ensemble $\mathcal{G}\{k, \lambda(v), \rho(d)\}$ is

$$\operatorname{coef}\left(\prod_{i=1}^{d_{\max}} (1 + x z^i)^{\rho_i n},\; x^{|e|} z^L\right) \Big/ \binom{n}{|e|}, \tag{3}$$

where $\operatorname{coef}(f(x), x^i)$ denotes the coefficient of $x^i$ in the
polynomial $f(x)$, and $d_{\max}$ denotes the maximal degree of the
output nodes. Here $\operatorname{coef}\left(\prod_{i=1}^{d_{\max}} (1 + x z^i)^{\rho_i n}, x^{|e|} z^L\right)$ is the
number of sets with $|e|$ output nodes and $L$ edges incident
upon them, and the total number of ways to select the
erasure pattern $e$ is $\binom{n}{|e|}$. Combining the two gives the
distribution (3) of the number of edges connected to $e$.

Now, we consider the number of input nodes incident upon
the $E = |E(e)|$ edges.

Theorem 2. The average bit erasure probability for the
Fountain ensemble $\mathcal{G}\{k, \lambda(v), \rho(d)\}$ when transmitting over
a BEC with erasure probability $\epsilon$ is

$$E_{\mathcal{G}(k,\lambda(v),\rho(d))}\left[P_b(G,\epsilon)\right] = \sum_{|e|} \epsilon^{|e|} (1-\epsilon)^{n-|e|} \times \sum_{L \ge 1} \operatorname{coef}\left(\prod_{i=1}^{d_{\max}} (1 + x z^i)^{\rho_i n},\; x^{|e|} z^L\right) \times \sum_{V=1}^{k} \frac{V}{k} \times P(e,L,V), \tag{4}$$

where $P(e, L, V)$ denotes the probability that the number of
uncorrectable edges is $L$
and the size of the maximal uncorrectable set is $V$.

Proof: Note that for the Fountain ensemble
$\mathcal{G}\{k, \lambda(v), \rho(d)\}$, if all edges incident upon an input
node belong to the edges connected to $e$, the uncorrectable
set of this input node is lost. Hence, this input node cannot
be recovered.

Assume that the set of $V$ nodes connected with $L$ edges
forms the maximal uncorrectable set. Then the number of
sets with $V$ input nodes and $L$ ($0 \le L \le E$) edges incident
upon them is

$$M_1(k,L,V) = \sum_{l \le L} \operatorname{coef}\left(\prod_{i \ge 1} (1 + y z^i)^{\lambda_i k},\; y^V z^l\right) (l)!. \tag{5}$$

Let $T(k,n)$ denote the number of all maps with $k$ input
nodes connected to $n$ output nodes. It is

$$T(k,n) = \left(\sum_{j \ge 1} j \times \lambda_j \times k\right)!. \tag{6}$$

Let $U$ be an uncorrectable set, that is, a nonempty
subset of the variable nodes such that any regular check node
$c$ that is connected to $U$ is connected to $U$ at least twice.
Obviously, $U \subseteq V$, where $V$ is the set of variable
nodes. Let $\mathcal{C}$ be a set such that any check node that is connected
to $\mathcal{C}$ but not to $U$ is connected to $\mathcal{C}$ at least twice, with
$\mathcal{C} \subseteq V \setminus U$. Let $\mathcal{K}$ be the maximal uncorrectable set. If every
check node that is connected to $\mathcal{C}$ but not to $U$ is connected to $\mathcal{C}$ at least twice,
then $\mathcal{C} = \mathcal{K} \setminus U$. If $V \setminus U$ does not contain a subset $\mathcal{C}$ with
this property, then $U$ is already the maximal uncorrectable set.
Define the functions $Q(k, L, V)$, $N(k, L, V)$, and
$M(k, L, V)$ by the recursions

$$Q(k, L, V) := \sum_{V > 0} M(k, L, V), \tag{7}$$

$$N(k, L, V) := T(k, n) - Q(k, L, V), \tag{8}$$

$$M(k, L, V) := M_1(k, L, V) \cdot N(k - V, E - L, 0), \tag{9}$$

where $M(k, L, V)$ is the number of maximal uncorrectable
sets of size $V$ with $E$ erased edges, and $N(k - V, E - L, 0)$ denotes the
number of configurations in which the remaining $k - V$ variable nodes with the
remaining $E - L$ edges do not contain an uncorrectable set.
There are $k - V$ variable nodes in $V \setminus U$, and there are
$E - L$ check nodes that are not neighbors of $U$. We have

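$$M(k, L, V) = \sum_{l \le L} \operatorname{coef}\left(\prod_{j=1}^{\rho_{\max}} (1 + y z^j)^{\lambda_j k},\; y^V z^l\right) (l)! \cdot N(k - V, E - L, 0). \tag{10}$$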


Then, the probability that the maximal uncorrectable set
has size $V$ is

$$P(e, L, V) = \frac{M(k, L, V)}{T(k, n)}.$$

It is easy to see that the probability that the erasure pattern is $e$ is
$\binom{n}{|e|} \epsilon^{|e|} (1 - \epsilon)^{n - |e|}$. The probability that $L$ edges are
connected to $e$ is $\operatorname{coef}\left(\prod_{i=1}^{d_{\max}} (1 + x z^i)^{\rho_i n}, x^{|e|} z^L\right) \big/ \binom{n}{|e|}$.

Consequently, (4) holds. ■
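
As an empirical companion to the ensemble average, the following Monte Carlo sketch estimates the average fraction of input nodes whose neighboring output nodes are all erased. It is only an illustration under stated assumptions: output degrees are drawn from a hypothetical weight list, each output picks distinct inputs uniformly, a fresh graph is sampled in every trial, and an input node without any neighbor counts as unrecoverable. The function names and parameters are not from the paper. Under these assumptions the expected fraction has the elementary closed form $\left(1 - (1-\epsilon)\bar{d}/k\right)^n$, where $\bar{d}$ is the mean output degree, which the sketch uses as a cross-check rather than as a replacement for (4).

```python
import random

def estimate_bit_erasure(k, n, degree_weights, eps, trials=2000, seed=1):
    """Monte Carlo estimate of the mean fraction of uncovered input nodes."""
    rng = random.Random(seed)
    degrees = list(range(1, len(degree_weights) + 1))
    lost = 0
    for _ in range(trials):
        covered = [False] * k   # does input j have a received neighbor?
        for _ in range(n):
            d = rng.choices(degrees, weights=degree_weights)[0]
            nbrs = rng.sample(range(k), d)
            if rng.random() >= eps:      # output node survives the BEC
                for j in nbrs:
                    covered[j] = True
        lost += covered.count(False)
    return lost / (trials * k)

def uncovered_probability(k, n, degree_weights, eps):
    """Closed form for the same quantity under the sketch's assumptions."""
    total = sum(degree_weights)
    mean_d = sum(d * w for d, w in enumerate(degree_weights, start=1)) / total
    # Each output independently survives and touches a given input
    # with probability (1 - eps) * mean_d / k.
    return (1 - (1 - eps) * mean_d / k) ** n

weights = [0.5, 0.3, 0.2]    # hypothetical weights for degrees 1..3
mc = estimate_bit_erasure(20, 30, weights, eps=0.3)
exact = uncovered_probability(20, 30, weights, eps=0.3)
print(round(mc, 3), round(exact, 3))
```

This estimate covers only the loss caused by erasures; it does not model failures of iterative decoding on the received symbols.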

In particular, when the degrees of the input nodes of a Fountain
code are uniformly randomly distributed, the parity matrix has
constant row weight $r$. The next theorem gives the bit erasure
probability of the constant row weight ensemble.

Theorem 3. The average bit erasure probability for the
Fountain ensemble $\mathcal{G}\{k, r, \rho(d)\}$ when transmitting over a
BEC with erasure probability $\epsilon$ is

$$E_{\mathcal{G}(k,r,\rho(d))}\left[P_b(G,\epsilon)\right] = \cdots \tag{11}$$

Proof: For the Fountain ensemble $\mathcal{G}\{k, r, \rho(d)\}$, similarly,
if all edges incident upon an input node belong to the edges
connected to $e$, the stopping set of this input node is lost.
Hence, this input node cannot be recovered.

Assume that the set of $V$ nodes connected with $L$ edges is the
maximal uncorrectable set. Then the number of sets with $V$
input nodes and $L$ ($0 \le L \le E$) edges incident upon them is

$$M_1(k, L, V) = \sum_{l \le L} \operatorname{coef}\left((1 + y z^r)^{k},\; y^V z^l\right) (l)!. \tag{12}$$

As in the proof of Theorem 2, we have

$$P(e, L, V) = \frac{M(k, L, V)}{T(k, n)}, \tag{13}$$

where

$$M(k, L, V) = M_1(k, L, V) \cdot N(k - V, E - L, 0). \tag{14}$$



Hence, (11) holds. ■

V. Integrated Performance Analysis of Fountain
Codes for BEC

The performance of Fountain codes over the BEC depends on
two aspects. One is the intrinsic code structure, which
stopping set analysis addresses. The other is the effect of
the channel characteristics, which can be analyzed through the
proposed uncorrectable set.

From Theorem 6 in the cited reference, the probability that $\mathcal{G}$ has a
maximal stopping set of size $s$ is at most

$$S(k, \epsilon, s) = \cdots \tag{15}$$

where $A_s(z, o)$ denotes the probability that a given subset
of size $s$ of the message nodes is a stopping set with
$z$ check nodes of degree zero and $o$
check nodes of degree one.

Eq. (15) represents the decoding error probability due to the
code structure. From the above analysis, the overall decoding error
set includes (1) the uncorrectable set due to erasures, and (2)
the received symbols that form a stopping set. Consequently,
the final decoding error probability is

$$E_{\mathcal{G}(k,r,\rho(d))}\left[P_b(G,\epsilon)\right] + E_{\mathcal{G}(k,r,\rho(d))}\left[S(k - V, |e| - L, s)\right] = \cdots \times \sum_{L \ge 1} \operatorname{coef}\left(\prod_{i=1}^{d_{\max}} (1 + x z^i)^{\rho_i n},\; x^{|e|} z^L\right) \times \sum_{V=1}^{k} \frac{V}{k} \times P(e, L, V) + \sum_{s} s \times \cdots, \tag{16}$$

where $E_{\mathcal{G}(k,r,\rho(d))}\left[S(k - V, |e| - L, s)\right]$ denotes the expected
probability that the $|e| - L$ received nodes and $k - V$ information nodes
have a maximal stopping set of size $s$.

VI. Conclusions

This paper proposes the concept of uncorrectable sets
for Fountain codes, because the conventional stopping set cannot
completely model the performance of Fountain codes,
especially over BECs. The advantage of the proposed approach
is that it allows the performance of codes to be analyzed
when output nodes are erased. The average bit erasure
probability over the BEC is analyzed. It can help in
designing efficient codes according to channel states. In future
research, we will design a low-complexity algorithm to
rapidly estimate the decoding error probability.